Databricks Cost Exports with nOps Platform

To integrate Databricks with nOps BC+, follow these steps:

  1. Enter S3 Bucket Details
  2. Deploy CloudFormation Stack
  3. Schedule Databricks Job

Accessing Business Context Integrations

To begin, navigate to the Organization Settings and click on Integrations. From there, select Business Contexts to proceed with setting up your Databricks integration.

Below is an example of the integrations page:
Business Context Integrations Interface

This page provides access to configure and manage integrations with your Business Contexts tools.

The list of integrations shows whether each tool is actively integrated or not yet connected. Active integrations are marked accordingly, so you can identify the current status of each at a glance.

Step 1: Enter S3 Bucket Details

Integrate Databricks with Business Contexts

To configure the S3 bucket for nOps file uploads, follow these steps:

  1. Select the Correct AWS Account

    • From the dropdown, select the AWS account where the S3 bucket resides.
  2. Enter Bucket Details

    • Bucket Name: Enter the name of your existing S3 bucket or create a new one.
    • Prefix: Specify a unique prefix (e.g., nops/) to ensure nOps only accesses files meant for the integration.
    important

    If you already have an S3 bucket configured for Databricks to write files, it’s recommended to use that bucket to avoid additional setup steps on Databricks.

    • If you don’t have an S3 bucket configured this way, follow these setup instructions to create the bucket and set it up in Databricks (a minimal scripted example is also sketched after these steps).
  3. Save the Configuration

    • Once the account, bucket name, and prefix are entered, click Setup to store these details.
  4. Redirect to CloudFormation Setup

    • After clicking Setup, you will be automatically redirected to create a CloudFormation stack for granting necessary permissions.
    important

    Make sure you are logged in to the AWS account selected above before proceeding with the following step.
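
If you prefer to create the bucket and prefix with a script rather than through the console, a minimal sketch using boto3 is shown below. The bucket name, region, and prefix are placeholder values; use the same ones you enter in the nOps form, and keep the linked setup instructions as the reference for configuring Databricks access to the bucket.

```python
import boto3

# Placeholder values -- replace with the bucket name, region, and prefix
# you enter in the nOps integration form.
BUCKET_NAME = "my-databricks-billing-exports"
REGION = "us-east-1"
PREFIX = "nops/"

s3 = boto3.client("s3", region_name=REGION)

# Create the bucket if it does not already exist.
# (us-east-1 does not accept a LocationConstraint.)
if REGION == "us-east-1":
    s3.create_bucket(Bucket=BUCKET_NAME)
else:
    s3.create_bucket(
        Bucket=BUCKET_NAME,
        CreateBucketConfiguration={"LocationConstraint": REGION},
    )

# Write a zero-byte marker object so the prefix shows up in the S3 console.
s3.put_object(Bucket=BUCKET_NAME, Key=PREFIX)
```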


Step 2: Deploy CloudFormation Stack

  1. On the redirected page, the parameters should be prefilled.
    • Click the checkbox to acknowledge the creation of IAM resources, then click Create stack.

      User Acknowledgement of Automated Resource Creation
note

Ensure the stack is deployed successfully. This step is crucial for nOps to access your data.
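
If you want to confirm the deployment outside of the CloudFormation console, you can query the stack status with boto3. This is a minimal sketch; the stack name below is a placeholder, so use the name shown in your CloudFormation console.

```python
import boto3

# Placeholder stack name -- use the name shown in the CloudFormation console.
STACK_NAME = "nops-databricks-integration"

cfn = boto3.client("cloudformation")
stack = cfn.describe_stacks(StackName=STACK_NAME)["Stacks"][0]

# CREATE_COMPLETE means the IAM resources nOps needs were provisioned.
print(stack["StackStatus"])
```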


Step 3: Schedule Databricks Job

After the steps above, you should see something like the following in your Databricks integration:

Overview of the Databricks Integration

  1. Generate Script:
    • Click the Generate Script button to obtain the billing extraction script.
    • Use the Copy button (shown below) to copy the script to your clipboard for use in Databricks.

Databricks Billing Extraction

tip

Ensure you copy the entire script accurately, as it contains the necessary configurations for Databricks to upload billing data to the S3 bucket.
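
The exact script is generated for you by nOps, so always use the copied version. Purely as an illustration of its general shape, a sketch is shown below of a notebook that reads billing data from the Databricks system tables and writes it to the S3 bucket and prefix configured in Step 1; the output path is a placeholder, and the snippet assumes it runs inside a Databricks notebook where `spark` and `dbutils` are predefined.

```python
# Illustration only: use the script generated by the Generate Script button.
# The output path and table below are assumptions, not the nOps script.

# The role_arn job parameter added in Step 3 is available as a notebook widget.
role_arn = dbutils.widgets.get("role_arn")

# Placeholder destination: the bucket and prefix configured in Step 1.
output_path = "s3://my-databricks-billing-exports/nops/billing_usage/"

# Databricks exposes billing records through the system.billing.usage table.
usage = spark.table("system.billing.usage")

# Write the usage records to S3 for nOps to pick up.
usage.write.mode("overwrite").parquet(output_path)
```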

  2. Create a Notebook:

    • Log in to your Databricks workspace.
    • Navigate to the Workspace section.
    • Click the Create button and select Notebook.
    • Name your notebook (e.g., NopsDatabricksBillingDataUploader) and choose the appropriate language (Python).
    • Copy and paste the script into the notebook.
  3. Schedule the Job Directly from the Notebook:

    • Click on Schedule in the top-right corner of the notebook toolbar.

    • In the Add Schedule dialog:
      Scheduled Execution of Databricks Notebooks

      • Set the frequency to Every 1 day.
      • Select the appropriate compute (cluster) for the job.
      • Click More options.
        • Click + Add to add a job parameter.
          • Enter role_arn as the key and your role ARN as the value.

      Dependencies of Scheduled Jobs in Databricks

    • Click Create to finalize the schedule.
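
If you prefer to create the schedule programmatically instead of through the notebook UI, an equivalent daily job can be defined with the Databricks SDK for Python. This is a sketch under assumptions: the notebook path, cluster ID, and role ARN are placeholders you must replace with your own values.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()  # reads workspace host and token from your Databricks config

# Placeholder values -- substitute your notebook path, cluster ID,
# and the role ARN from the nOps integration.
job = w.jobs.create(
    name="NopsDatabricksBillingDataUploader",
    tasks=[
        jobs.Task(
            task_key="upload_billing_data",
            existing_cluster_id="<your-cluster-id>",
            notebook_task=jobs.NotebookTask(
                notebook_path="/Workspace/Users/<you>/NopsDatabricksBillingDataUploader",
                base_parameters={"role_arn": "arn:aws:iam::<account-id>:role/<nops-role>"},
            ),
        )
    ],
    # Run once per day (Quartz cron syntax), matching the "Every 1 day" UI setting.
    schedule=jobs.CronSchedule(
        quartz_cron_expression="0 0 2 * * ?",
        timezone_id="UTC",
    ),
)
print(f"Created job {job.job_id}")
```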


When Will Data Be Available?

Once the integration is configured, data will begin flowing into your Business Context tools. It may take up to 24 hours for the initial data synchronization to complete. During this period, nOps will fetch and process the necessary data from the S3 bucket to ensure accurate insights and functionality.

If data is not visible after 24 hours or you encounter any issues, please reach out to our support team for assistance.
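
Before reaching out, you can also confirm whether the scheduled Databricks job has uploaded any files by listing the configured prefix. A short boto3 check is sketched below, with the bucket name and prefix as placeholders matching what you set in Step 1.

```python
import boto3

# Placeholders -- use the bucket and prefix configured in Step 1.
BUCKET_NAME = "my-databricks-billing-exports"
PREFIX = "nops/"

s3 = boto3.client("s3")
response = s3.list_objects_v2(Bucket=BUCKET_NAME, Prefix=PREFIX)

# If the scheduled Databricks job has run, billing files should be listed here.
for obj in response.get("Contents", []):
    print(obj["Key"], obj["LastModified"])
```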